Gene Ontology (GO) Prediction using Machine Learning Methods

Authors

  • Haoze Wu
  • Yangyu Zhou
Abstract

We applied machine learning to predict whether a gene is involved in axon regeneration. We extracted 31 features from different databases and trained five machine learning models. Our optimal model, a Random Forest Classifier with 50 submodels, yielded a test score of 85.71%, which is 4.1% higher than the baseline score. We concluded that our models have some predictive capability. Similar methodology and features could be applied to predict other Gene Ontology (GO) terms.
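
The comparison described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn and synthetic data. The 31 features and the 50 trees mirror the numbers above. Everything else is an assumption made for illustration: the generated data, the train/test split, the helper name `compare_to_baseline`, and the majority-class baseline (the abstract does not say how its baseline score was defined).

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def compare_to_baseline(n_samples=600, seed=0):
    """Train a 50-tree random forest and a majority-class baseline.

    Hypothetical helper: the data is synthetic, not the authors' gene table.
    """
    # Synthetic stand-in for the feature table: 31 features per gene and a
    # binary label (assumed meaning: involved in axon regeneration or not).
    X, y = make_classification(
        n_samples=n_samples, n_features=31, n_informative=8, random_state=seed
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )

    # Random forest with 50 trees ("submodels" in the abstract's wording).
    forest = RandomForestClassifier(n_estimators=50, random_state=seed)
    forest.fit(X_train, y_train)

    # Assumed baseline: always predict the most frequent training class.
    baseline = DummyClassifier(strategy="most_frequent")
    baseline.fit(X_train, y_train)

    return {
        "n_features": X.shape[1],
        "n_trees": len(forest.estimators_),
        "forest_accuracy": forest.score(X_test, y_test),
        "baseline_accuracy": baseline.score(X_test, y_test),
    }


if __name__ == "__main__":
    for name, value in compare_to_baseline().items():
        print(f"{name}: {value}")
```

The scores this prints describe the synthetic data only and are not comparable to the 85.71% reported above. The point is the shape of the evaluation: both models are scored on the same held-out split, so the gap between them is what "higher than the baseline" measures.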

Related resources

Discriminative local subspaces in gene expression data for effective gene function prediction

MOTIVATION Massive amounts of genome-wide gene expression data have become available, motivating the development of computational approaches that leverage this information to predict gene function. Among successful approaches, supervised machine learning methods, such as Support Vector Machines (SVMs), have shown superior prediction accuracy. However, these methods lack the simple biological in...

The effects of shared information on semantic calculations in the gene ontology

The structured vocabulary that describes gene function, the gene ontology (GO), serves as a powerful tool in biological research. One application of GO in computational biology calculates semantic similarity between two concepts to make inferences about the functional similarity of genes. A class of term similarity algorithms explicitly calculates the shared information (SI) between concepts th...

Gene Ontology-driven inference of protein-protein interactions using inducers

MOTIVATION Protein-protein interactions (PPIs) are pivotal for many biological processes and similarity in Gene Ontology (GO) annotation has been found to be one of the strongest indicators for PPI. Most GO-driven algorithms for PPI inference combine machine learning and semantic similarity techniques. We introduce the concept of inducers as a method to integrate both approaches more effectivel...

FFPred: an integrated feature-based function prediction server for vertebrate proteomes

One of the challenges of the post-genomic era is to provide accurate function annotations for large volumes of data resulting from genome sequencing projects. Most function prediction servers utilize methods that transfer existing database annotations between orthologous sequences. In contrast, there are few methods that are independent of homology and can annotate distant and orphan protein se...

Technical supplement to “Consistent probabilistic outputs for protein function prediction”

Protein function prediction, in the context of the Gene Ontology, is a task that consists of answering, for a fixed protein X, a large number of binary questions of the form: "Does protein X belong to GO term Y?" Those binary classification problems are strongly related because the ontology consists of nested classes. Two natural requirements for this prediction problem are • that the set of...

Journal:
  • CoRR

Volume abs/1711.00001

Publication date: 2017